
(2017) Adversarial Transformation Networks: Learning to Generate Adversarial Examples

Baluja S, Fischer I. Adversarial transformation networks: Learning to generate adversarial examples[J]. arXiv preprint arXiv:1703.09387, 2017.



1. Overview


1.1. Motivation

  • existing methods either directly compute gradients on the image pixels or solve an optimization problem on them

This paper proposes the Adversarial Transformation Network (ATN), which is

  • trained in a self-supervised manner to generate adversarial examples
  • fast to execute




2. Methods


2.1. ATN

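  • transforms an input into an adversarial example against a target network or set of networks
  • the paper focuses on targeted, white-box ATNs

An ATN is defined as a neural network

  g_{f,θ}(x): x ∈ X → x'

  • f. target network, which outputs a probability distribution over classes
  • θ. parameter vector of g
  • x'. adversarial example, close to x but with argmax f(x) ≠ argmax f(x')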

2.2. Training



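Training g_{f,θ} solves the optimization

  argmin_θ Σ_{x_i ∈ X} β L_X(g_{f,θ}(x_i), x_i) + L_Y(f(g_{f,θ}(x_i)), f(x_i))

  • L_X. loss function in the input space (e.g. L2 loss) or a perceptual loss
  • L_Y. specially formed loss on the output space of f, designed to avoid learning the identity function
  • β. weight that balances the two loss functions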
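The way the two losses combine can be sketched in a few lines of NumPy. This is only an illustration of evaluating the weighted sum of an input-space loss and an output-space loss over a dataset, not the paper's training code: the linear-softmax classifier `f`, the constant-shift "ATN" `g`, the squared-distance loss `sq_dist`, and the fixed target distribution are all hypothetical stand-ins, and no gradient step over the parameters of g is performed.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

def atn_objective(g, f, xs, loss_x, loss_y, beta):
    """Sum over x_i of beta * L_X(g(x_i), x_i) + L_Y(f(g(x_i)), f(x_i))."""
    total = 0.0
    for x in xs:
        x_adv = g(x)  # candidate adversarial example
        total += beta * loss_x(x_adv, x) + loss_y(f(x_adv), f(x))
    return total

# Hypothetical toy setup (assumptions, not from the paper).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))                  # toy classifier: 4 features -> 3 classes
f = lambda x: softmax(W @ x)                 # stand-in target network
g = lambda x: x + 0.1                        # stand-in "ATN": a constant shift
sq_dist = lambda a, b: float(np.sum((a - b) ** 2))
target = np.array([0.0, 0.0, 1.0])           # desired output distribution (class 2)
loss_y = lambda y_adv, y: sq_dist(y_adv, target)

xs = rng.normal(size=(5, 4))
total = atn_objective(g, f, xs, sq_dist, loss_y, beta=0.1)
print(total)
```

A larger β penalizes visible changes to the input more heavily; a smaller β lets the output-space term dominate. In actual training, g would be a neural network and this quantity would be minimized over its parameters by gradient descent.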

2.3. Inference

  • even faster than the single-step gradient-based methods, so long as


2.4. Loss Functions



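  • L_X. L2 loss in the input space
  • L_Y. for a target class t,

  L_{Y,t}(y', y) = L_2(y', r(y, t))

  • y = f(x), y' = f(g_f(x)). outputs of the target network on the original and the transformed input
  • r(·). reranking function that modifies y so that y_k < y_t for all k ≠ t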
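As a rough illustration of an output-space loss built on a reranking function, the sketch below compares the network's output on the transformed input against a reranked version of its output on the original input. It assumes "L2" means squared Euclidean distance and uses a one-hot reranking as the simplest possible choice; the function names `l2_loss`, `output_loss`, and `onehot_rerank` are hypothetical.

```python
import numpy as np

def l2_loss(a, b):
    # Assumption: "L2" is taken as the squared Euclidean distance.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))

def output_loss(y_adv, y, t, rerank):
    """Distance between y' = f(g(x)) and the reranked target r(y, t)."""
    return l2_loss(y_adv, rerank(y, t))

def onehot_rerank(y, t):
    # Simplest reranking: ignore y and put all probability mass on class t.
    target = np.zeros(len(y))
    target[t] = 1.0
    return target

y = np.array([0.7, 0.2, 0.1])          # f(x): original prediction is class 0
y_fooled = np.array([0.1, 0.1, 0.8])   # f(g(x)) after a successful attack on t = 2
y_same = y.copy()                      # f(g(x)) if g had learned the identity

loss_fooled = output_loss(y_fooled, y, 2, onehot_rerank)
loss_same = output_loss(y_same, y, 2, onehot_rerank)
print(loss_fooled, loss_same)
```

The identity mapping is penalized much more than the output that ranks class 2 first, which is what pushes g away from simply copying its input.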


2.5. Reranking Function

  • simplest way. set r(y, t) = onehot(t)



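  • alternative. keep the rank order of all classes except the target, to minimize the distortion needed in x':

  r_α(y, t) = norm({α · max(y) if k = t; y_k otherwise}_{k ∈ y})

  • α > 1. specifies how much larger y_t should be than the current maximum classification
  • norm(·). rescales its input to a valid probability distribution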
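A minimal sketch of an α-based reranking, assuming the target entry is set to α times the current maximum and the result is renormalized by dividing by its sum (the choice of normalization and the name `rerank_alpha` are assumptions made for illustration):

```python
import numpy as np

def rerank_alpha(y, t, alpha=1.5):
    """Raise class t to alpha * max(y), leave the other entries, renormalize."""
    if alpha <= 1.0:
        raise ValueError("alpha must be > 1 so that class t becomes the maximum")
    y = np.asarray(y, dtype=float)
    target = y.copy()
    target[t] = alpha * y.max()    # y_t now exceeds the previous maximum
    return target / target.sum()   # assumed normalization: divide by the sum

y = np.array([0.7, 0.2, 0.1])      # original prediction: class 0
r = rerank_alpha(y, t=2, alpha=1.5)
print(r, int(np.argmax(r)))
```

Unlike the one-hot target, the reranked vector keeps the relative order of the non-target classes (class 0 still outranks class 1), so the adversarial output is asked to stay close to the original prediction apart from the promoted class.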